Blueprint components and configuration
Blueprints provide a YAML-based framework for defining custom connectors. This structured configuration allows you to connect to REST APIs and build scalable data pipelines without the need for custom scripting.
Blueprint YAML structure
Each Blueprint is composed of the following core components:
| Component | Description | Required |
|---|---|---|
| interface_parameters | User-configurable inputs (authentication, filters, dates) | No |
| connector | API connection settings (base URL, headers, storage) | Yes |
| steps | Workflow logic (REST calls, loops, data extraction) | Yes |
Example: Complete structure configuration
# 1. Interface Parameters - User inputs displayed in River configuration
interface_parameters:
  section:
    source:
      - name: "api_credentials"
        type: "authentication"
        auth_type: "bearer"
        fields:
          - name: "bearer_token"
            type: "string"
            is_encrypted: true
      - name: "date_range"
        type: "date_range"
        period_type: "date"
        format: "YYYY-mm-DD"
        fields:
          - name: "start_date"
            value: ""
          - name: "end_date"
            value: ""

# 2. Connector Configuration - API connection settings
connector:
  name: "My API Connector"
  base_url: "https://api.example.com/v1"
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
  default_retry_strategy:
    500:
      max_attempts: 3
      retry_interval: 10
    429:
      max_attempts: 5
      retry_interval: 60
  variables_metadata:
    final_output_file:
      format: "json"
      storage_name: "results_dir"
  variables_storages:
    - name: "results_dir"
      type: "file_system"

# 3. Steps - Workflow logic
steps:
  - name: "Fetch Data"
    description: "Retrieve data from the API"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/data"
    query_params:
      start_date: "{date_range.start_date}"
      end_date: "{date_range.end_date}"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.data"
Connector configuration
The connector configuration establishes the foundational settings for your API integration.
Configuration fields
| Field | Description | Required |
|---|---|---|
| name | Descriptive name for the connector. | Yes |
| base_url | Root URL for all API requests. | Yes |
| default_headers | Headers sent with every request. | No |
| default_retry_strategy | Retry policies for failed requests. | No |
| variables_metadata | Variable storage configuration. | Yes |
| variables_storages | Storage location definitions. | Yes |
Base URL
The base URL is the root endpoint for your API. All step endpoints are appended to this URL.
connector:
  name: "Salesforce Connector"
  base_url: "https://mycompany.salesforce.com/services/data/v58.0"
Do not include trailing slashes in the base URL.
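As a minimal sketch of how the base URL and a step endpoint combine, the fragment below reuses fields shown on this page. The step name and the /sobjects path are illustrative assumptions, not part of the original example.

```yaml
connector:
  name: "Salesforce Connector"
  # No trailing slash, so the endpoint below joins cleanly.
  base_url: "https://mycompany.salesforce.com/services/data/v58.0"

steps:
  - name: "List Objects"          # hypothetical step name
    type: "rest"
    http_method: "GET"
    # Expected to resolve to:
    #   https://mycompany.salesforce.com/services/data/v58.0/sobjects
    endpoint: "{{%BASE_URL%}}/sobjects"   # "/sobjects" is an illustrative path
```

If base_url ended with a slash, the same endpoint would likely contain a double slash (v58.0//sobjects), which some APIs reject.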
Default headers
Headers that should be sent with every API request:
connector:
  default_headers:
    Content-Type: "application/json"
    Accept: "application/json"
    X-Custom-Header: "custom-value"
Do not include Authorization headers here. Authentication is automatically injected from interface parameters.
Default retry strategy
Configure automatic retries for specific HTTP status codes. Each status code can have a unique maximum attempt count and interval.
- Max attempts (max_attempts): The maximum number of times the connector retries a request that returned this status code.
- Retry interval (retry_interval): The duration, in seconds, to wait between attempts.
connector:
  default_retry_strategy:
    429:  # Rate Limited
      max_attempts: 5
      retry_interval: 60  # seconds
    500:  # Internal Server Error
      max_attempts: 3
      retry_interval: 10
    502:  # Bad Gateway
      max_attempts: 3
      retry_interval: 10
    503:  # Service Unavailable
      max_attempts: 3
      retry_interval: 30
    504:  # Gateway Timeout
      max_attempts: 3
      retry_interval: 10
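To make the cost of a retry policy concrete, the annotated fragment below works out the longest time the connector could spend waiting for two of the status codes above. It assumes that one retry_interval elapses before each of the max_attempts retries and ignores the time of the requests themselves. This counting rule is an assumption, not something this page states.

```yaml
connector:
  default_retry_strategy:
    429:                  # Rate Limited
      max_attempts: 5
      retry_interval: 60  # worst case: 5 x 60 s = 300 s of waiting
    503:                  # Service Unavailable
      max_attempts: 3
      retry_interval: 30  # worst case: 3 x 30 s = 90 s of waiting
```

Totals like these help when choosing intervals for a workflow that issues many requests, because the waits accumulate per failing request.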
Variables storage
The variables storage configuration defines how and where the connector stores extracted data during execution.
connector:
  variables_metadata:
    final_output_file:
      format: "json"
      storage_name: "results_dir"
    intermediate_data:
      format: "json"
      storage_name: "results_dir"
  variables_storages:
    - name: "results_dir"
      type: "file_system"
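The fragment below sketches how these definitions might tie together with a step. It assumes that a step's variable_name refers to an entry declared under variables_metadata, and that storage_name must match a name under variables_storages. Both links are inferred from the examples on this page rather than stated outright, and the step itself is hypothetical.

```yaml
connector:
  variables_metadata:
    intermediate_data:               # declared variable
      format: "json"
      storage_name: "results_dir"    # points at the storage defined below
  variables_storages:
    - name: "results_dir"
      type: "file_system"

steps:
  - name: "Fetch Intermediate Data"  # hypothetical step
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/data"
    variables_output:
      - response_location: "data"
        variable_name: "intermediate_data"  # same name as in variables_metadata
        variable_format: "json"
```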
Workflow steps
Steps define the execution logic of your connector. Steps execute sequentially. You can use data extracted from one step in subsequent steps.
Step types
The following table describes the available step types.
| Type | Description |
|---|---|
| rest | Executes an HTTP request (GET, POST, PUT, PATCH, DELETE). |
| loop | Iterates over a data collection, executing nested steps for each element. |
REST step
A REST step executes a single HTTP request.
steps:
  - name: "Get Users"
    description: "Fetch all users from the API"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/users"
    query_params:
      status: "active"
      limit: "100"
    headers:
      X-Request-ID: "unique-id"
    retry_strategy:
      500:
        max_attempts: 3
        retry_interval: 10
    variables_output:
      - response_location: "data"
        variable_name: "users_list"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.data.users"
HTTP methods
The http_method determines the action the request performs on the resource.
| Method | Description |
|---|---|
| GET | Retrieves data. |
| POST | Creates data or sends payloads. |
| PUT | Updates or replaces data. |
| PATCH | Applies partial updates to data. |
| DELETE | Removes data. |
POST request with body
Use the body field to define the data payload when creating or updating a record.
steps:
  - name: "Create Record"
    description: "Create a new record via POST"
    type: "rest"
    http_method: "POST"
    endpoint: "{{%BASE_URL%}}/records"
    headers:
      Content-Type: "application/json"
    body:
      name: "{{%record_name%}}"
      email: "{{%record_email%}}"
      status: "active"
    variables_output:
      - response_location: "data"
        variable_name: "created_record"
        variable_format: "json"
Loop step
Loop steps iterate over collections of data and execute nested steps for each item.
steps:
  # Step 1: Fetch list of account IDs
  - name: "Get Account IDs"
    description: "Retrieve all account IDs"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/accounts"
    variables_output:
      - response_location: "data"
        variable_name: "account_ids"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.accounts[*].id"  # Extract just the IDs

  # Step 2: Loop through each account ID
  - name: "Process Each Account"
    description: "Fetch details for each account"
    type: "loop"
    loop:
      type: "data"
      variable_name: "account_ids"
      item_name: "account_id"  # Each item IS the ID
      add_to_results: true
      ignore_errors: false
    steps:
      - name: "Get Account Details"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/accounts/{{%account_id%}}/details"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false
The item_name represents the entire current item in the iteration. To use a specific property, such as an ID, extract it in the transformation layer so that each loop item contains only the required value.
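To illustrate the difference, consider a hypothetical response from the /accounts endpoint (the field values are invented for this sketch). The comments show what each loop item would contain for two JSONPath choices, following standard JSONPath semantics.

```yaml
# Hypothetical response body:
#   {"accounts": [{"id": "A1", "name": "Acme"}, {"id": "B2", "name": "Beta"}]}
transformation_layers:
  - type: "extract_json"
    from_type: "json"
    # Yields "A1", then "B2": each loop item is the ID itself, so
    # /accounts/{{%account_id%}}/details becomes /accounts/A1/details.
    json_path: "$.accounts[*].id"
    # By contrast, "$.accounts[*]" would yield whole objects such as
    # {"id": "A1", "name": "Acme"}, which cannot be placed in a URL path as-is.
```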
Loop configuration options
The following table describes the fields for configuring a loop.
| Field | Description | Required |
|---|---|---|
| type | Specifies the loop type: data, date_range, or while. | Yes |
| variable_name | Identifies the variable containing the array to iterate. | Yes |
| item_name | Sets an alias for the current item in the iteration. | Yes |
| add_to_results | Includes the loop output in the final results. | Yes |
| ignore_errors | Continues the loop if individual items fail. | No |
External variables loop
When a loop is the first step in your workflow, you can iterate over data passed from the source River using the {ext.} syntax:
steps:
  - name: "Process External IDs"
    description: "Loop through IDs from source River"
    type: "loop"
    loop:
      type: "data"
      variable_name: "{ext.source_ids}"  # External variable syntax
      item_name: "item_id"
      add_to_results: true
      ignore_errors: true
    steps:
      - name: "Fetch Item"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/items/{{%item_id%}}"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false
The {ext.} prefix can only be used in the first step of a workflow.
Accessing external dictionary properties
When an external variable is a dictionary (object), you can access its properties using dot notation.
steps:
  - name: "Fetch Using External Config"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/users/{{%{ext.config.user_id}%}}"
    query_params:
      region: "{{%{ext.config.region}%}}"
Dot notation for property access ({{%variable.property%}}) only works with external dictionary variables. For standard loop items, extract the specific values in the transformation layer.
Variable outputs and transformations
Define the variables_output object to specify how the connector handles response data.
Variable output configuration
variables_output:
  - response_location: "data"  # data, header, or status_code
    variable_name: "users_data"
    variable_format: "json"
    overwrite_storage: false  # Append (false) or replace (true)
    transformation_layers:
      - type: "extract_json"
        from_type: "json"
        json_path: "$.data.users[*]"
Response locations
The following table describes the available source locations for variables.
| Location | Description |
|---|---|
| data | Extracts content from the response body. |
| header | Extracts values from the response headers. |
| status_code | Captures the numerical HTTP status code. |
Transformation layers
Apply transformation layers to modify or filter response data before storing it.
transformation_layers:
  - type: "extract_json"
    from_type: "json"
    json_path: "$.data.items[*]"
Supported transformations
The following table lists the available transformation types.
| Type | Description |
|---|---|
| extract_json | Extracts specific data using JSONPath syntax. |
| extract_csv | Parses incoming CSV data into a usable format. |
| to_json | Converts the data into JSON format. |
| to_csv | Converts the data into CSV format. |
Common JSONPath patterns
# Direct property
json_path: "$.data"
# Nested property
json_path: "$.data.users"
# All array items
json_path: "$.data.users[*]"
# Specific field from all items
json_path: "$.data.users[*].id"
# Root array
json_path: "$[*]"
# ID of the last item in an array
json_path: "$.data[-1].id"
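As a worked example, the comments below pair a hypothetical response with the values each pattern would select under standard JSONPath semantics. The sample data is invented for illustration, and support for negative indexes such as [-1] depends on the JSONPath implementation in use.

```yaml
# Hypothetical response body:
#   {"data": {"users": [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Bo"}]}}
#
# Pattern                 Selected values
# $.data                  the whole "data" object
# $.data.users            the users array, as a single value
# $.data.users[*]         each user object: {"id": 1, ...}, then {"id": 2, ...}
# $.data.users[*].id      1, then 2
# $.data.users[-1].id     2
json_path: "$.data.users[*].id"
```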
Variable reference syntax
Use the following syntax patterns to inject dynamic data into your configuration.
| Context | Syntax | Description | Example |
|---|---|---|---|
| Internal variable | {{%variable_name%}} | References data from a previous step. | /users/{{%user_id%}} |
| Loop item | {{%item_name%}} | References the current loop item value. | /orders/{{%order_id%}} |
| Interface parameter | {param_name} | References a user input value. | ?status={status_filter} |
| Date range start | {param.start_date} | References a start date from a picker. | from={dates.start_date} |
| Date range end | {param.end_date} | References an end date from a picker. | to={dates.end_date} |
| External data | {ext.variable_name} | References data from a source River. | {ext.incoming_ids} |
| External dict property | {{%{ext.dict.property}%}} | References a property from a dictionary. | {{%{ext.config.user_id}%}} |
| Base URL | {{%BASE_URL%}} | References the connector base URL. | {{%BASE_URL%}}/users |
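The following sketch combines several of these reference styles in one step. The step name and the /orders path are invented for illustration, the parameter names status_filter and dates are taken from the Example column, and user_id is assumed to have been set by an earlier step or loop.

```yaml
steps:
  - name: "Fetch User Orders"            # hypothetical step
    type: "rest"
    http_method: "GET"
    # Base URL placeholder plus an internal variable from a previous step
    endpoint: "{{%BASE_URL%}}/users/{{%user_id%}}/orders"
    query_params:
      status: "{status_filter}"          # interface parameter
      from: "{dates.start_date}"         # date range start
      to: "{dates.end_date}"             # date range end
```

Note the two delimiter styles: double braces with percent signs for the base URL and internal variables, single braces for interface parameters.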
Workflow patterns
The following patterns show common implementation strategies for connector workflows.
Pattern 1: Simple data fetch
Single REST step to fetch data:
steps:
  - name: "Fetch All Records"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/records"
    pagination:
      type: "page"
      location: "qs"
      parameters:
        - name: "page"
          value: 1
          increment_by: 1
        - name: "per_page"
          value: 100
      break_conditions:
        - name: "No More Data"
          condition:
            type: "empty_json_path"
            key_json_path: "$.data"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
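To show how the page parameter and the break condition interact, the annotated fragment below traces the requests this pagination block would be expected to send against a hypothetical API that holds 250 records. The record count, the empty response on the last page, and the exact stopping behavior are assumptions for illustration.

```yaml
pagination:
  type: "page"
  location: "qs"
  parameters:
    - name: "page"
      value: 1          # request 1: ?page=1&per_page=100 -> 100 records
      increment_by: 1   # request 2: ?page=2 -> 100 records; request 3: ?page=3 -> 50 records
    - name: "per_page"
      value: 100
  break_conditions:
    - name: "No More Data"
      condition:
        type: "empty_json_path"
        key_json_path: "$.data"   # request 4: ?page=4 returns an empty "data", so paging stops
```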
Pattern 2: Parent-child relationship
Fetch a list, then get details for each item:
steps:
  # Step 1: Get parent records
  - name: "Get Organization IDs"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/organizations"
    variables_output:
      - response_location: "data"
        variable_name: "org_ids"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.organizations[*].id"  # Extract just the IDs

  # Step 2: Loop through each and get child records
  - name: "Get Org Members"
    type: "loop"
    loop:
      type: "data"
      variable_name: "org_ids"
      item_name: "org_id"  # Each item IS the org ID
      add_to_results: true
    steps:
      - name: "Fetch Members"
        type: "rest"
        http_method: "GET"
        endpoint: "{{%BASE_URL%}}/organizations/{{%org_id%}}/members"
        variables_output:
          - response_location: "data"
            variable_name: "final_output_file"
            variable_format: "json"
            overwrite_storage: false
Pattern 3: Sequential API calls
Multiple REST steps in sequence, where a later step uses a value captured by an earlier one:
steps:
  # Step 1: Authenticate and get token
  - name: "Get Access Token"
    type: "rest"
    http_method: "POST"
    endpoint: "{{%BASE_URL%}}/auth/token"
    body:
      grant_type: "client_credentials"
    variables_output:
      - response_location: "data"
        variable_name: "auth_token"
        variable_format: "json"
        transformation_layers:
          - type: "extract_json"
            from_type: "json"
            json_path: "$.access_token"

  # Step 2: Use token to fetch data
  - name: "Fetch Protected Data"
    type: "rest"
    http_method: "GET"
    endpoint: "{{%BASE_URL%}}/protected/data"
    headers:
      Authorization: "Bearer {{%auth_token%}}"
    variables_output:
      - response_location: "data"
        variable_name: "final_output_file"
        variable_format: "json"
Best practices
- Naming conventions: Use clear and descriptive names.
  Example:
    # Good
    variable_name: "user_profiles"
    variable_name: "order_transactions"
    variable_name: "final_output_file"
    # Avoid
    variable_name: "data"
    variable_name: "temp"
    variable_name: "x"
- Error handling: Configure retry strategies for common error codes.
  Example:
    retry_strategy:
      429:
        max_attempts: 5
        retry_interval: 60
      500:
        max_attempts: 3
        retry_interval: 10
- Pagination safety: Include break conditions in all paginated requests to prevent infinite loops.
  Example:
    break_conditions:
      - name: "Primary: Empty Data"
        condition:
          type: "empty_json_path"
          key_json_path: "$.data"
      - name: "Safety: Page Size Check"
        condition:
          type: "page_size_break"
          page_size_param_name: "limit"
          items_json_path: "$.data"
- Loop configuration:
  - Set ignore_errors: true if the workflow must continue despite individual item failures.
  - Set add_to_results: true to include loop outputs in the final result.
  - Test workflows with small data sets before executing full production runs.
- Security:
  - Mark all sensitive fields with is_encrypted: true.
  - Never hardcode credentials in YAML.
  - Use interface parameters for all authentication.
Advanced features
The following advanced features are available for complex scenarios. For more information, refer to the YAML reference guide.
| Feature | Description |
|---|---|
| PUT/PATCH/DELETE methods | Additional HTTP methods for data modification. |
| Date range loops | Iterates through date chunks. |
| While loops | Repeats an execution block until a specific condition is met. |
| Pre-run configuration | Executes setup steps before the main workflow starts. |
| Multi-report configuration | Generates multiple report outputs from one workflow. |
| Advanced break conditions | Evaluates string equality, numeric values, or compound OR logic. |
| CSV transformations | Parses and converts data between JSON and CSV formats. |
| RFC8288 Link header pagination | Uses standard Link headers to manage paginated API responses. |